## Nov 16, 2023 | RISC-V Perf Analysis SIG Meeting

Attendees: Beeman Strong, tech.meetings@riscv.org, Marc Casas

## **Notes**

- Attendees: Beeman, Guillem, Snehasish, DmitryR, JohnS, RobertC, Chun, BruceA
  - Guillem: HW engineer from Gaisler
  - Chun: Individual & ByteDance, formerly AMD and others
- Slides/video here
- Reported on extension progress
- Summit was good for perf analysis, well attended F2F
  - Notable requests: improved sampling, and a common trace format for simulators (à la ARM's Tarmac)
    - Should trace fall under DTPM rather than Perf Analysis SIG? Worth checking with Iain.
- Performance Events TG
  - DmitryR has volunteered to chair; he worked on VTune before moving to Syntacore
    - Still need an interim VC, speak up if interested
    - Robert: Send request for chairs to newsletter, elsewhere?
  - Plan to define standard events & metrics
  - Will need to define the events for TMA metrics
    - May have custom events that feed into standard metrics? TBD by TG
- Sampling
  - Reviewed gap analysis and comparison with other architectures (PEBS, SPE, etc.)
    - Snehasish: SPE samples uops, not instructions
    - Beeman: believe ARM leaves it up to implementations, but may be most often implemented as uops
  - Chun: how to correlate CPU events and uncore/SoC events
    - Trend is lots of cores competing for shared resources
    - Beeman: next priority is standardizing SoC PMUs
    - For associating SoC events with code, may need some bits on the bus that can carry data source or other SoC info, such that some SoC events can be counted in the CPU
  - Chun: ARM has activity monitors, always available/running event counters
    - Would like to be able to push counter (e.g., TMA) values into samples
    - Beeman: first need to standardize events, but then can consider whether to define some dedicated counters for them
    - Beeman: but agreed that we want to be able to push counter values into samples
  - Discussed instruction sampling vs event sampling, tradeoffs
    - Instruction sampling is cheaper to implement and can collect more interesting data per sample, but event sampling is easier to use and better for sampling infrequent events
  - PEBS usage
    - Users typically still take interrupts on each sample, to collect context and call stack
    - If we want to avoid the need for interrupts, we'll need to figure out how to address these
- Out of time, will continue sampling discussion next meeting
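
To illustrate the point above that event sampling is better for infrequent events, here is a minimal simulation sketch. It is not from the meeting: the function name `simulate`, the event probability, and both sampling periods are assumptions chosen for illustration. It models a stream of instructions where a rare event (say, a cache miss) occurs with a small probability, then compares how many samples land on the event under fixed-period instruction sampling versus fixed-period event sampling. It does not model the other side of the tradeoff (implementation cost, or how much data each sample can carry).

```python
import random


def simulate(n_instructions=1_000_000, event_prob=0.001,
             instr_period=1_000, event_period=10, seed=0):
    """Compare instruction sampling and event sampling on one instruction stream.

    All parameters are illustrative assumptions, not values from the notes.
    """
    rng = random.Random(seed)
    instr_samples = []  # (instruction index, whether it had the rare event)
    event_samples = []  # instruction indices where an event sample was taken
    total_events = 0

    for i in range(1, n_instructions + 1):
        had_event = rng.random() < event_prob

        # Event sampling: a counter counts occurrences of the event and
        # takes a sample every `event_period` occurrences.
        if had_event:
            total_events += 1
            if total_events % event_period == 0:
                event_samples.append(i)

        # Instruction sampling: every `instr_period`-th instruction is
        # sampled, whether or not it experienced the event.
        if i % instr_period == 0:
            instr_samples.append((i, had_event))

    return total_events, instr_samples, event_samples


total_events, instr_samples, event_samples = simulate()
instr_hits = sum(1 for _, had_event in instr_samples if had_event)

print(f"events in stream:           {total_events}")
print(f"instruction samples taken:  {len(instr_samples)}")
print(f"  ...that hit the event:    {instr_hits}")
print(f"event samples taken:        {len(event_samples)} (all hit the event)")
```

With these assumed parameters, instruction sampling takes about ten times as many samples as event sampling, yet only a handful of them are expected to land on the rare event, while every event sample does by construction.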

## Action items

| Done         | Owner           | Date         | Action item                                                                                           |
|--------------|-----------------|--------------|-------------------------------------------------------------------------------------------------------|
| $\checkmark$ | Atish Kumar Pat | Aug 25, 2022 | Check on how to read multiple counters in perf when taking an interrupt on one. See "leader sampling" |
|              | Beeman Strong   | Jul 28, 2022 | Reach out about proprietary performance analysis tools                                                |
|              | Beeman Strong   | Jul 28, 2022 | Reach out to VMware about PMU enabling                                                                |
| $\checkmark$ | Beeman Strong   | Jul 28, 2022 | Talk to security HC about counter delegation                                                          |
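
For reference on the "leader sampling" action item: Linux perf supports group leader sampling via the `:S` modifier on an event group, so that only the first event (the leader) triggers samples and the values of the other group members are read at each sample. The sketch below only builds such a command line. The helper name `leader_sampling_cmd`, the event list, and the `./app` workload are hypothetical, and whether a given RISC-V platform supports this is not something the notes confirm.

```python
import shlex


def leader_sampling_cmd(events, workload):
    """Build a `perf record` argv that uses group leader sampling.

    `events[0]` becomes the group leader (the only event that samples);
    the `:S` modifier asks perf to read all group members on each sample.
    This helper is a hypothetical illustration, not part of perf.
    """
    if not events:
        raise ValueError("need at least one event; the first is the group leader")
    group = "{" + ",".join(events) + "}:S"
    return ["perf", "record", "-e", group, "--", *workload]


cmd = leader_sampling_cmd(["cycles", "instructions", "cache-misses"], ["./app"])
# shlex.join quotes the braces so the line can be pasted into a shell.
print(shlex.join(cmd))
```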